UniNE at CLEF 2016: Author Profiling

Authors

  • Mirco Kocher
  • Jacques Savoy
Abstract

This paper describes and evaluates an author profiling model called SPATIUM-L1. The suggested strategy can be readily adapted to different Indo-European languages (such as Dutch, English, and Spanish). As features, we suggest using the m most frequent terms of the query text (isolated words and punctuation symbols, with m at most 200). Applying a simple distance measure and looking at the five nearest neighbors, we can determine the gender (with the nominal values “male” or “female”) and the age group (an ordinal variable with the categories 18-24 | 25-34 | 35-49 | 50-64 | >65). While the labeled training data consists of Twitter tweets, the evaluations are based on three test collections from a different, undisclosed genre (blogs, reviews, social media, ...) (PAN AUTHOR PROFILING task at CLEF 2016).
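The strategy outlined in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the "simple distance measure" is taken to be an L1 (Manhattan) distance over relative term frequencies, as the name SPATIUM-L1 suggests; the five nearest neighbors are combined by a plain majority vote; and the tokenizer is a simple regular expression. All function names (`tokenize`, `l1_distance`, `predict`) are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Assumption: a token is either a run of word characters or a single
# punctuation symbol. The paper's actual tokenizer is not given here.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split a text into lowercase words and isolated punctuation symbols."""
    return TOKEN_RE.findall(text.lower())


def relative_frequencies(tokens):
    """Map each term to its relative frequency in the token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def l1_distance(query_tokens, profile_tokens, m=200):
    """L1 distance restricted to the m most frequent terms of the query."""
    query_freq = relative_frequencies(query_tokens)
    profile_freq = relative_frequencies(profile_tokens)
    top_terms = [term for term, _ in Counter(query_tokens).most_common(m)]
    return sum(
        abs(query_freq[term] - profile_freq.get(term, 0.0)) for term in top_terms
    )


def predict(query_text, labeled_texts, k=5, m=200):
    """Return the majority label among the k nearest labeled texts.

    labeled_texts is a list of (text, label) pairs, where label is a string
    such as "female" or "25-34". Majority voting is an assumption of this
    sketch, not a rule stated in the abstract.
    """
    query_tokens = tokenize(query_text)
    scored = sorted(
        (l1_distance(query_tokens, tokenize(text), m), label)
        for text, label in labeled_texts
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

In this sketch the same `predict` function would be called twice per query text, once with gender labels and once with age-group labels. Restricting the feature set to the query's own most frequent terms keeps the comparison cheap and independent of any global vocabulary.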

Similar resources

UniNE at CLEF 2015 Author Profiling: Notebook for PAN at CLEF 2015

This paper describes and evaluates an effective author profiling model called SPATIUM-L1. The suggested strategy can be adapted without any problem to different languages (such as Dutch, English, Italian, and Spanish) in Twitter tweets. As features, we suggest using the 200 most frequent terms of the query text (isolated words and punctuation symbols). Applying a simple distance measure and loo...

UniNE at CLEF 2017: Author Profiling Reasoning

This paper describes and evaluates a supervised author profiling model. The suggested strategy can be adapted without any problem to various languages (such as Arabic, English, Spanish, and Portuguese). As features, we suggest using the m most frequent terms of the query text (isolated words and punctuation symbols with m at most 200). Applying a simple distance measure and looking at the neare...

UniNE at CLEF 2017: TF-IDF and Deep-Learning for Author Profiling

This paper describes and evaluates a strategy for author profiling using TF-IDF and a Deep-Learning model based on Convolutional Neural Networks. We applied this strategy to the author profiling task of the PAN17 challenge and show that it can be applied to different languages (English, Spanish, Portuguese and Arabic). As features, we suggest using a simple cleaning method for both models, and ...

UniNE at CLEF 2016: Author Clustering

This paper describes and evaluates an effective unsupervised author clustering and authorship linking model called SPATIUM-L1. The suggested strategy can be adapted without any problem to different languages (such as Dutch, English, and Greek) in different genres (e.g., newspaper articles and reviews). As features, we suggest using the m most frequent terms of each text (isolated words and punctuat...

UniNE at CLEF 2015 Author Identification: Notebook for PAN at CLEF 2015

This paper describes and evaluates an unsupervised authorship verification model called SPATIUM-L1. The suggested strategy can be adapted without any problem to different languages (such as Dutch, English, Greek, and Spanish) whose genres and topics differ significantly. As features, we suggest using the k most frequent terms of the disputed text (isolated words and punctuation symbols with ...

Publication date: 2016